Abstract. We consider user-private information retrieval (UPIR), an interesting alternative to private
information retrieval (PIR) introduced by Domingo-Ferrer et al. In UPIR, the database knows which 
records have been retrieved, but does not know the identity of the query issuer. The goal of UPIR is to 
disguise user profiles from the database. Domingo-Ferrer et al. focus on using a peer-to-peer community 
to construct a UPIR scheme, which we term P2P UPIR. In this paper, we establish a strengthened 
model for P2P UPIR and clarify the privacy goals of such schemes using standard terminology from 
the field of privacy research. In particular, we argue that any solution providing privacy against the 
database should attempt to minimize any corresponding loss of privacy against other users. We give an 
analysis of existing schemes, including a new attack by the database. Finally, we introduce and analyze 
two new protocols. Whereas previous work focuses on a special type of combinatorial design known as 
a configuration, our protocols make use of more general designs. This allows for flexibility in protocol 
set-up, allowing for a choice between having a dynamic scheme (in which users are permitted to enter 
and leave the system), or providing increased privacy against other users. 

1 Introduction 

We consider the case of a user who wishes to maintain privacy when requesting documents from 
a database. One existing method to address this problem is private information retrieval (PIR). 
In PIR, the content of a given query is hidden from the database, but the identity of the user 
making the query is not protected. In this paper, we focus on an interesting alternative to PIR 
dubbed user-private information retrieval (UPIR), introduced by Domingo-Ferrer et al. [3]. UPIR,
however, is only nominally related to PIR, in that it seeks to provide privacy for users of a database. 
In UPIR, the database knows which records have been retrieved, but does not know the identity of 
the person making the query. The problem that we address, then, is how to disguise user profiles 
from the point of view of the database. 

We draw some of our terminology from Pfitzmann [6]. Here we understand anonymity as the
state of not being identifiable within a set of subjects, and the anonymity set is the set of all 
possible subjects. By untraceable queries from the point of view of the database, we mean that the 
database cannot determine that a given set of queries belongs to the same user. One interesting 
caveat, which is addressed below, is that a set of queries might be deemed to come from the same 
user based on the subject matter of those queries. If the subject matter of a given set of queries is 
esoteric or otherwise unique, the database (or some other adversary) can surmise that the identity 
of the source is the same for all (or most) queries in this set; we call such a set of queries linked. 
In the case of linked queries, we wish to provide as much privacy as possible, in the sense that 
we wish the database to have no probabilistic advantage in guessing the identity of the source of 
a given set of linked queries. In this way, we can say the user making the linked queries still has 
pseudonymity: his identity is not known.

With this terminology in mind, we might better explain UPIR as a method of database querying
that is privacy-preserving and satisfies the following properties from the point of view of the
database:

1. For any given user Ui, some (large) subset of all users U is the query anonymity set for Ui;

2. User queries are anonymous; 

3. User queries are untraceable; 

4. Given a set of queries that is unavoidably traceable due to subject matter, the person making 
the query is protected by pseudonymity. 

In addition to these basic properties of user-privacy against a database, we may wish to provide 
user-privacy against other users. Ideally, a UPIR scheme would provide the same privacy guarantees 
against other users as against the database, but we will see that this usually cannot be attained in 
practice. 

Previous work [2, 3, 8, 9] has focused on the use of a P2P network consisting of various encrypted
"memory spaces" (i.e., drop boxes), to which users can post their own queries, submit queries to 
the database and post the respective answers, and read answers to previously posted queries. That 
is, in the P2P UPIR setting, we have a cooperating community of users who act as proxies to 
submit each other's queries to the database. In particular, a class of combinatorial designs known
as configurations has been suggested by Domingo-Ferrer, Bras-Amoros et al. [2, 3, 8, 9] as a way to
specify the structure of the P2P network. In this work, we focus on P2P UPIR and consider the 
application of other types of designs in determining the structure of the P2P network. We introduce 
new P2P UPIR protocols and explore the level of privacy guarantees our protocols achieve, both 
against the database and against other users. 

1.1 Our Contributions 

The main contributions of our work are as follows. 

— We establish a strengthened model for P2P UPIR and clarify the privacy goals of such schemes 
using standard terminology from the field of privacy research. 

— We provide an analysis of the protocol introduced by Domingo-Ferrer and Bras-Amoros [2, 3],
as well as its subsequent variations. In particular, we reconsider the choice to limit the designs
used as the basis for the P2P UPIR scheme to configurations. We provide a new attack on 
user-privacy against the database, which we call the intersection attack, to which the above 
protocol variations are vulnerable. 

— We introduce two new P2P UPIR protocols (and variations on these), and give an analysis of 
the user-privacy these protocols provide, both against the database and against other users. 
Our protocols utilize more general designs and resist the intersection attack by the database. 
In particular, our protocols provide more flexibility in designing the P2P network. 

— We consider the possible trade-offs of using different types of designs in the P2P UPIR setting, 
both with respect to the overall flexibility of the scheme as well as user-privacy. Our protocols 
provide viable design choices, which can allow for a dynamic UPIR scheme (i.e., one in which 
users are permitted to enter and leave the system), or provide increased privacy against other 
users. 

— We consider the problem of user-privacy against other users in detail. In particular, we relax 
the assumptions of previous work, by allowing users to collaborate outside the parameters of
the P2P UPIR scheme; that is, we consider a stronger adversarial model than previous work.
We analyze the ability of different types of designs to provide user-privacy against other users, 
and explore how well our protocol resists an intersection attack launched by a coalition of users 
on linked queries. Finally, we introduce methods to improve privacy against other users without 
compromising privacy against the database. 

We now give an outline of our paper. In Section 2, we give a model for P2P UPIR schemes and
provide the relevant privacy goals. Section 3 provides background information on designs. We then
review previous work in Section 4 and give attacks on these protocols in Section 4.1. We introduce
our protocols in Section 5 and give an analysis of the privacy guarantees our protocols provide
against the database. In Section 6, we analyze the ability of our protocols to provide user-privacy
against other users and consider ways to improve this type of privacy. We conclude in the final section.

2 Our P2P UPIR Model 

A P2P UPIR scheme consists of the following players: a finite set of possible users U = {U1, . . . , Uv},
the target database DB, and an external observer, O. We assume all communication in a P2P UPIR 
scheme is encrypted, including communication between the users and DB. 

In a basic P2P UPIR scheme, users have access to secure drop boxes known as memory spaces. 
More precisely, a memory space is an abstract (encrypted) storage space in which some subset of 
users can store and extract queries and query responses; the exact structure of these spaces is not 
specified. We let S = {S1, . . . , Sb} denote the set of memory spaces, and we let Ki denote the
(symmetric) key associated with Si, for 1 ≤ i ≤ b. We assume that encryption keys for memory
spaces are only known to a given subset of users, as specified by the P2P UPIR protocol. For the 
sake of simplicity, we assume that these keys are initially distributed in a secure manner by some 
trusted external entity (not the database DB). However, the precise method by which these keys are 
distributed is not relevant to the results we prove in this paper. If two distinct users Ui, Uj ∈ U have
access to a common memory space, then we say Ui and Uj are neighbors. Similarly, the neighborhood 
of a user Ui is defined as the set of all neighbors of Ui, and it is denoted as N(Ui). 
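
As a small illustration of the neighbor relation, the following sketch computes N(Ui) from a list of memory spaces. The integer user labels, the function name `neighborhood`, and the toy layout are assumptions made for this example only; they are not part of the scheme's specification.

```python
from itertools import chain


def neighborhood(user, memory_spaces):
    """Return N(user): all other users sharing at least one memory space with `user`."""
    shared_spaces = (space for space in memory_spaces if user in space)
    return set(chain.from_iterable(shared_spaces)) - {user}


# Hypothetical toy layout: four users (labelled 1..4) and three memory spaces,
# each memory space given as the set of users holding its key.
spaces = [{1, 2}, {2, 3}, {3, 4}]
for user in (1, 2, 3, 4):
    print(user, sorted(neighborhood(user, spaces)))
```

In this toy layout users 1 and 4 each have a single neighbor, so a query forwarded by user 2 could only have originated from user 1 or user 3.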

When a user Ui wishes to send a query q to DB, we say Ui is the source of the query. Rather 
than sending the query directly to DB, Ui writes an encrypted copy of q, together with a requested 
proxy Uj, to a memory space Sℓ. Here Uj is the proxy for Ui's query q, and consequently Uj must
know the encryption key Kℓ corresponding to the memory space Sℓ. The user Uj decrypts the query,
re-encrypts q under a secret key shared with DB, say K_j^DB, and forwards this re-encrypted query
e_{K_j^DB}(q) to DB. DB sends back a response, which Uj first decrypts, then re-encrypts under Kℓ and
records in the memory space Sℓ. We give a schematic of the information flow of a basic P2P UPIR
scheme in Figure 1.

2.1 Attack Model 

We consider each type of player as a possible adversary A. We assume that A has full knowledge 
of the P2P UPIR scheme specification, including any public parameters, as well as any secret 
information assigned to A as part of the P2P UPIR scheme. In addition, we assume A does not 
conduct traffic analysis. The following definition will be useful. 

Definition 2.1. Consider a set of one or more users C. The query sphere for C is the set of
memory spaces that C can (collectively) access via the P2P UPIR scheme.

Fig. 1. Schematic of Information Flow

In addition to the above, we make the following assumptions about each specific type of adversary A:

— Suppose A is the database DB. As stated above, we assume that DB does not observe infor- 
mation being posted to or read from memory spaces. In addition, we assume that DB does not 
collaborate with any users and answers queries honestly. We note DB necessarily observes the 
content of all queries and the proxy of each query. 

— Suppose A consists of a user or a subset of colluding users C ⊆ U. We assume users are
honest-but-curious. Users in C can communicate outside of the given P2P UPIR scheme and 
collaborate using joint information. The users of C can see the content of any queries within 
C's query sphere, but cannot identify the original source of these queries. 

— Suppose A is an external adversary O. An external observer O can see the encrypted content of 
memory spaces. We consider the possibility of key leakage as the main attack launched by O. 
This refers to a party gaining access to a memory space key outside of the P2P UPIR scheme 
specification (e.g., by social engineering or other means). 

Although we do not specifically treat traffic analysis as an attack, we wish to avoid a trivial 
analysis of traffic entering and leaving a given memory space. That is, we assume that memory 
spaces have encryption and decryption capabilities, so that a user acting as a proxy may decrypt 
and re-encrypt a given query within its associated memory space, before forwarding the query to 
the database. 

2.2 Privacy and Adversarial Goals in P2P UPIR 

In considering the privacy guarantees for a user Ui, we assume either the database DB or a group 
of other users may try to determine whether Ui is the source of a given set of queries, or try to 
establish whether or not a given set of queries originates from the same source. We will need the 
following definition: 

Definition 2.2. We say two or more queries q1, q2, . . . are linked if, given the subject matter, one
can infer that the queries are likely to be from the same source. 

We also consider the possibility of an external observer O gaining information that compromises 
the privacy of Ui. We recognize the following goals for Ui's privacy:

— Confidentiality: the content of Ui's queries is protected;

— Anonymity: the identity of a query source is protected;

— Untraceability: a user's query history cannot be reconstructed as having originated from the 
same user; 

— Pseudonymity in the presence of linked queries: given a set of linked queries, the identity of the 
source is protected. 

We now analyze each type of adversary A with respect to the above privacy goals: 

— Suppose A is the database DB. We are not concerned with confidentiality against DB, but 
rather anonymity, untraceability, and pseudonymity in the presence of linked queries. The goal 
of the database is to create a profile of Ui. That is, the database would like to establish the set 
of queries for which Ui is the source. The database also attempts to trace user query histories; 
that is, DB would like to establish that a given set of queries came from the same source, even 
if DB cannot determine the identity of the source. 

— Suppose A consists of a user or a subset of colluding users C ⊆ U. The coalition C collaborates to
try to determine the query history of another user Ui ∉ C. Here we are interested in maintaining
anonymity, untraceability, and pseudonymity in the presence of linked queries against C. We 
are also interested in maintaining confidentiality, in the sense that C should not have access to 
the content of queries outside the query sphere for C. 

— Suppose A is an external adversary O. The goal of O is to compromise both the confidentiality 
and the anonymity of Ui. External adversaries may try to compromise the encryption mechanism
of the memory spaces. 

We are now almost ready to consider the P2P UPIR protocols of Domingo-Ferrer, Bras-Amoros
et al. [2, 3], as well as the subsequent modification of Stokes and Bras-Amoros [8, 9]. Both these
protocols and ours, however, draw heavily from the field of combinatorial designs. In the next 
section, we introduce the requisite background knowledge on combinatorial designs. 

3 Background on Designs 

For a general reference on designs, we refer the reader to Stinson [7].

Definition 3.1. A set system is a pair (X,A) such that the following are satisfied: 

1. X is a set of elements called points, and 

2. A is a collection (i.e., multiset) of nonempty proper subsets of X called blocks. 

In the rest of this section, we abuse notation by writing blocks in the form abc instead of {a, b, c}. 

Definition 3.2. The degree of a point x ∈ X is the number of blocks containing x. If all points
have the same degree, r, we say (X,A) is regular (of degree r). 

Definition 3.3. The rank of (X, A) is the size of the largest block. If all blocks contain the same 
number of points, say k, then (X,A) is uniform (of rank k). Note that k < v. 

Definition 3.4. A covering design is a set system in which every pair of points occurs in at least 
one block. 

Example 3.1. A covering design.

X = {1, 2, 3, 4, 5, 6, 7} and A = {13, 23, 157, 124, 347, 356, 2567, 14567}. 

Definition 3.5. A pairwise balanced design (or PBD) is a set system such that every pair of
distinct points is contained in exactly λ blocks, where λ is a fixed positive integer. Note that any
PBD is a covering design.

Example 3.2. A PBD with λ = 2.

X = {1, 2, 3, 4, 5} and A = {12, 25, 135, 145, 1234, 2345}. 

Definition 3.6. Let (X, A) be a regular and uniform set system of degree r and rank k, where
|X| = v and |A| = b. We say (X, A) is a (v, b, r, k)-1 design.

Example 3.3. A (5, 5, 3, 3)-1 design.

X = {1, 2, 3, 4, 5} and A = {123, 451, 234, 512, 345}. 

Definition 3.7. A (v, b, r, k, λ)-balanced incomplete block design (or BIBD) is a (v, b, r, k)-1 design
in which every pair of points occurs in exactly λ blocks. Equivalently, a (v, b, r, k, λ)-BIBD is
a PBD that is regular and uniform of degree r and rank k.

Example 3.4. A (10, 15, 6, 4, 2)-BIBD.

X = {0,1, 2, 3, 4, 5, 6, 7, 8, 9} 

and 

A = {0123, 0147, 0246, 0358, 0579, 0689, 1258, 1369, 
1459, 1678, 2379, 2489, 2567, 3478, 3456}. 
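
The definitions above can be checked mechanically. The following sketch computes the point degrees, block sizes, and pair multiplicities of a set system and uses them to test whether Example 3.4 has the stated parameters. The helper names `design_profile` and `is_bibd`, and the encoding of blocks as strings of digits, are choices made for this illustration.

```python
from collections import Counter
from itertools import combinations


def design_profile(points, blocks):
    """Return (set of point degrees, set of block sizes, set of pair multiplicities)."""
    degrees = Counter(x for block in blocks for x in block)
    sizes = {len(block) for block in blocks}
    pair_counts = Counter(
        pair for block in blocks for pair in combinations(sorted(block), 2)
    )
    # Enumerate every pair of points so that pairs occurring in no block count as 0.
    multiplicities = {pair_counts[p] for p in combinations(sorted(points), 2)}
    return {degrees[x] for x in points}, sizes, multiplicities


def is_bibd(points, blocks):
    """True if the set system is regular, uniform, and every pair occurs equally often (> 0)."""
    degs, sizes, mults = design_profile(points, blocks)
    return len(degs) == 1 and len(sizes) == 1 and len(mults) == 1 and 0 not in mults


# Example 3.4: points are the digits 0..9, each block is written as a string of points.
points = "0123456789"
blocks = ["0123", "0147", "0246", "0358", "0579", "0689", "1258", "1369",
          "1459", "1678", "2379", "2489", "2567", "3478", "3456"]

degs, sizes, mults = design_profile(points, blocks)
print(len(points), len(blocks), degs, sizes, mults)
```

For Example 3.4 the profile is v = 10, b = 15, with every point of degree 6, every block of size 4, and every pair in exactly 2 blocks, matching the stated (10, 15, 6, 4, 2) parameters. The same profile distinguishes the weaker structures: a covering design only needs 0 to be absent from the multiplicities.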

Definition 3.8. A (v, b, r, k)-configuration is a (v, b, r, k)-1 design such that every pair of distinct
points is contained in at most 1 block. 

Example 3.5. A (9, 9, 3, 3)-configuration. 

X = {1, 2, 3, 4, 5, 6, 7, 8, 9} and A = {147, 258, 369, 159, 267, 348, 168, 249, 357}. 

Remark 3.1. A (v, b, r, k)-configuration with v = r(k - 1) + 1 is a (v, b, r, k, 1)-BIBD. A (v, b, r, k, 1)-BIBD
with parameters of the form (n^2 + n + 1, n^2 + n + 1, n + 1, n + 1, 1) is a finite projective plane
of order n.
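
The first claim of Remark 3.1 follows from a short counting argument, sketched here using only the definitions of this section.

```latex
% Fix a point x in a (v, b, r, k)-configuration.
% x lies in r blocks, each containing k - 1 points other than x.
% No pair {x, y} lies in two blocks, so these r(k - 1) points are all distinct:
\[
  \bigl|\{\, y \neq x : x \text{ and } y \text{ share a block} \,\}\bigr| = r(k-1) \le v - 1 .
\]
% If v = r(k - 1) + 1, equality holds for every x, so every pair of distinct
% points occurs in at least one block and hence, by the configuration
% property, in exactly one block:
\[
  v = r(k-1) + 1 \;\Longrightarrow\; \lambda = 1 .
\]
```

For instance, the (9, 9, 3, 3)-configuration of Example 3.5 has r(k - 1) = 6 < 8 = v - 1, so it is not a BIBD: each point fails to share a block with two other points.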

Definition 3.9. A symmetric BIBD is a BIBD in which b = v. 

Remark 3.2. A projective plane is a symmetric BIBD. 

Theorem 3.1. (Fisher's Inequality) In any (v, b, r, k, λ)-BIBD, b ≥ v.

Theorem 3.2. In a symmetric BIBD any two blocks intersect in exactly λ points.
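
Two standard counting identities relate the parameters of a BIBD. They are not stated explicitly in this section, but they are convenient whenever probabilities are computed by summing over blocks and points; the check against Example 3.4 is included only as a sanity test.

```latex
% Count incident (point, block) pairs in two ways:
\[ bk = vr . \]
% Fix a point x and count pairs (y, B) with y \neq x and x, y \in B in two ways:
\[ \lambda (v - 1) = r (k - 1) . \]
% Check against Example 3.4, a (10, 15, 6, 4, 2)-BIBD:
\[ 15 \cdot 4 = 60 = 10 \cdot 6, \qquad 2 \cdot 9 = 18 = 6 \cdot 3 . \]
```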

3.1 P2P UPIR using Combinatorial Designs

We model a P2P UPIR scheme using a combinatorial design. That is, we consider pairs (U, S)
(where as before |U| = v and |S| = b), such that each memory space, or block, consists of k users
and each user, or point, is associated with r memory spaces. That is, we assume that the pair (U, S)
is a (v, b, r, k)-1 design.

We can also view the b memory spaces as points and define v blocks, each of which contains 
the memory spaces to which a given user belongs. This yields the dual design (S,U), which is a 
(b, v, k, r)-1 design.
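
The passage from a design to its dual is a simple re-indexing, sketched below on the (5, 5, 3, 3)-1 design of Example 3.3. The function name `dual_design` and the use of 0-based block indices as the points of the dual are assumptions of this illustration.

```python
def dual_design(points, blocks):
    """Map each point to the set of indices of the blocks that contain it.

    The values of the returned dict are the blocks of the dual design,
    whose points are the block indices 0, 1, ..., b - 1.
    """
    return {x: {i for i, block in enumerate(blocks) if x in block} for x in points}


# Example 3.3: users are the points 1..5, memory spaces are the five blocks.
users = [1, 2, 3, 4, 5]
spaces = [{1, 2, 3}, {4, 5, 1}, {2, 3, 4}, {5, 1, 2}, {3, 4, 5}]

dual = dual_design(users, spaces)
for user, key_ring in dual.items():
    print(user, sorted(key_ring))
```

Each dual block is the set of memory spaces, and hence keys, held by one user; its size is r, and each memory space index appears in exactly k dual blocks.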

4 Previous Work: Using Configurations 

We briefly review the P2P UPIR scheme proposed by Domingo-Ferrer et al. [3] and the proposed 
modification of Stokes and Bras-Amoros [9]. We fix a (v, b, r, k)-configuration (U, S). As before,
we have a finite set of users U = {U1, . . . , Uv}, a database DB, and a finite set of memory spaces
S = {S1, . . . , Sb}.

Each user has access to r memory spaces, and each memory space is accessible to k users. Each 
memory space is encrypted via a symmetric encryption scheme; for each memory space, only the k 
users assigned to that memory space are given the key. The following protocol [3] assumes the user 
Ui has a query to submit to the database: 

Protocol 1 Domingo-Ferrer-Bras-Amoros-Wu-Manjon (DBWM) Protocol 
We fix a (v, b, r, k)-configuration.

1. The user Ui randomly selects a memory space Sℓ to which he has access.

2. The user Ui decrypts the content on the memory space Sℓ using the corresponding key. His
behavior is then determined by the content on the memory space as follows:

(a) The content is garbage. Then Ui encrypts his query and records it in Sℓ.

(b) The content is a query posted by another user. Then Ui forwards the query to the database
and awaits the answer. When Ui receives the answer, he encrypts it and records it in Sℓ. He
then restarts the protocol with the intention to post his query.

(c) The content is a query posted by the user himself. Then Ui does not forward the query to
the database. Instead Ui restarts the protocol with the intention to post his query.

(d) The content is an answer to a query posted by another user. Then Ui restarts the protocol
with the intention to post his query.

(e) The content is an answer to a query posted by the user himself. Then Ui reads the query
answer and erases it from the memory space. Subsequently Ui encrypts his new query and
records it in Sℓ.

The modification proposed by Stokes and Bras-Amoros [9] replaces step 2(c) as follows:

Protocol 2 DBWM-Stokes (DBWMS) Protocol

2'(c). If the content is a query posted by the user himself, then Ui forwards the query to the database
with a specified probability p. If Ui forwards the query to the database, he records the answer in
Sℓ. The user Ui restarts the protocol with the intention to post his current query.

Remark 4.1. This protocol is ambiguous as stated by Stokes and Bras-Amoros. The intent of the
system is that users periodically run the protocol with "garbage" queries, in this way collecting the
answers to their previous queries.

Stokes and Bras-Amoros [8, 9] argue that the finite projective planes are the optimal configurations
to use for P2P UPIR. Their argument is that privacy against the database is an increasing
function of r(k - 1), since there are r(k - 1) users in the anonymity set of any given user Ui. That is,
the query profile of Ui is diffused among r(k - 1) other users in the neighborhood of Ui. Now, since
r(k - 1) ≤ v - 1 in a configuration, the authors consider configurations satisfying r(k - 1) = v - 1,
which yield the finite projective planes. In our protocols, introduced in Section 5, we also have
neighborhoods of maximum size, without limiting ourselves to configurations. We also ensure that 
the database DB has no advantage in guessing the identity of the source of any given query. 

4.1 Attacks 

We consider the privacy properties of the DBWM and DBWMS protocols with respect to the 
database, before offering an improved protocol in Section 5. We fix a (v, b, r, k)-configuration, where
v is the number of users and b is the number of memory spaces. We associate a block with each 
memory space, where the block consists of the users that have access to the memory space. 

The weakness of the DBWM and DBWMS protocols lies in the possibility of a user's query 
history being identifiable as originating from one user. That is, if a series of queries is on some 
esoteric subject, the adversary (such as the database) can surmise that the source of these queries 
is the same. As before, we refer to such queries as linked. 

Stokes and Bras-Amoros [9] noticed a weakness in the DBWM protocol when a projective plane
is used as the configuration, that is, when v = r(k - 1) + 1. In this case, each user Ui has a
neighborhood consisting of all other users. Then, given a large enough set of linked queries, the
only user who never submits one of these linked queries is the source, Ui. That is, the database
can eventually identify Ui as the source. Subsequent to our research, Stokes and Bras-Amoros [10]
noted this weakness applies more generally to (v, k, 1)-BIBDs.

We introduce another type of attack, which we call the intersection attack. This attack only
applies to configurations satisfying v > r(k - 1) + 1, as it requires that all users have neighborhoods
of cardinality less than v - 1. The idea behind the intersection attack is that, given a query q1
submitted by proxy Uj, an attacker can, by analyzing the neighborhood of Uj, compute a list of
possible sources Q1. If the attacker has access to a set of linked queries q1, q2, . . . , qn, and the
neighborhoods of the corresponding proxies do not consist of all users in the system, the intersection
of the possible source sets Q1, Q2, . . . , Qn can perhaps identify the source (or narrow down the list of
possible sources). We demonstrate this attack in the following example.

Example 4.1. Suppose v = 12 and b = 8 and we have the following blocks (memory spaces):

{U1, U2, U3} {U4, U5, U6} {U7, U8, U9} {U10, U11, U12}
{U1, U4, U7} {U2, U5, U10} {U3, U8, U11} {U6, U9, U12}

Note this is a (12, 8, 2, 3)-configuration. We consider the DBWM protocol here; that is, we
assume that the proxy of a given query is always different from the source of the query. Now
suppose three queries are transmitted to DB by users U2, U11, and U8.

— If the proxy is U2, then the source Ui ∈ {U1, U3, U5, U10}.

— If the proxy is U11, then the source Ui ∈ {U3, U8, U10, U12}.

— If the proxy is U8, then the source Ui ∈ {U3, U7, U9, U11}.

Suppose that the subject of the queries is similar, so it can be inferred that the source of the three
queries is probably the same user. Then it is easy to identify the source of the queries:

Ui ∈ {U1, U3, U5, U10} ∩ {U3, U8, U10, U12} ∩ {U3, U7, U9, U11},

so Ui = U3. Clearly, user-privacy with respect to the database is not achieved here.
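
The computation in Example 4.1 can be reproduced with a few lines of code. This is a minimal sketch of the intersection attack as described above, with users represented by their integer indices; the function names and the `allow_self` flag (which models protocols in which a user may forward his own query) are assumptions of this illustration.

```python
from functools import reduce

# Memory spaces of the (12, 8, 2, 3)-configuration from Example 4.1.
BLOCKS = [
    {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12},
    {1, 4, 7}, {2, 5, 10}, {3, 8, 11}, {6, 9, 12},
]


def possible_sources(proxy, blocks, allow_self=False):
    """Users who could have asked `proxy` to forward a query (the proxy's neighborhood)."""
    candidates = set().union(*(block for block in blocks if proxy in block))
    return candidates if allow_self else candidates - {proxy}


def intersection_attack(proxies, blocks, allow_self=False):
    """Intersect the candidate source sets of a non-empty list of linked queries."""
    return reduce(
        set.intersection,
        (possible_sources(p, blocks, allow_self) for p in proxies),
    )


# Three linked queries arrive via proxies U2, U11 and U8.
print(intersection_attack([2, 11, 8], BLOCKS))  # {3}
```

The attack needs only the public design and the proxy identities that DB observes anyway, which is why the choice of design, rather than the encryption of memory spaces, determines whether it succeeds.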

We do not claim that the above-described attack will always work for any configuration; it 
is easy to come up with examples where the attack does not work. For example, suppose that 
N(Ui) ∪ {Ui} = N(Uj) ∪ {Uj} for two distinct users Ui and Uj. Then it would be impossible for
DB to determine whether Ui or Uj is the source of a sequence of linked queries. Independently of
this research, Stokes and Bras-Amoros [10] noted that by choosing the configuration carefully, it 
is possible to ensure the neighborhood-to-user mapping is not unique, and to guarantee a specified 
lower bound on the number of possible users for a given neighborhood. 

Observe that the intersection attack is not useful when one uses a finite projective plane as the 
configuration and users are allowed to submit their own queries. This follows because, at each stage 
of the intersection attack, the set of possible sources includes all users in the set system. In the 
next section, we formalize this observation and discuss the use of more general types of designs in 
P2P UPIR protocols that resist the intersection attack in a very strong sense. 

5 Using More General Designs 

As observed in Section 4.1, in order to achieve user-privacy with respect to the database, we need to
allow users to sometimes transmit their own queries. We suggest a different solution to the problem 
than that given by Bras-Amoros et al., however. In particular, we see no reason to limit the P2P 
network topology to configurations. Bras-Amoros et al. indicate use of configurations as a method 
to increase service availability and decrease the number of required keys. Indeed, configurations 
were proposed as key rings in wireless sensor networks by Lee and Stinson [5] due to memory 
constraints of sensor nodes. However, storage constraints are not so much an issue in P2P UPIR. 
We therefore consider the possibility of using other types of designs. 

We will make use of memory spaces that "balance" proxies for every source. We suggest using
a balanced incomplete block design (BIBD) for the set of memory spaces. We will show that these
designs provide optimal resistance against the intersection attack. 

Our scheme also differs from DBWMS in the treatment of proxies. In the previous schemes, the 
identity of a proxy was not specified by the source. Queries were simply forwarded to the database 
by whichever user had most recently checked the corresponding memory space. We propose that 
each source designates the proxy for each query. This enables us to balance the proxies for each 
possible source, thereby providing "perfect" anonymity with respect to the database. Moreover, we 
do not assume that each memory space holds only a single query; rather, we assume that memory 
spaces are capable of storing multiple queries. 

Protocol 3 Proxy-designated BIBD Protocol (Version 1) 

We fix a (v, b, r, k, λ)-BIBD. To submit a query, a user Ui uses the following steps:

1. With probability 1/v, user Ui acts as his own proxy and transmits his own query to DB.

2. Otherwise, user Ui chooses uniformly at random one of the r memory spaces with which he is
associated, say Sℓ, and then he chooses uniformly at random a user Uj ∈ Sℓ \ {Ui}. Finally, user
Ui requests that user Uj act as his proxy using the memory space Sℓ.

Protocol 4 Proxy-designated BIBD Protocol (Version 2) 

We fix a (v, b, r, k, λ)-BIBD. To submit a query, a user Ui uses the following steps:

1. With probability 1/v, user Ui chooses to act as his own proxy. User Ui then writes the query
uniformly at random to one of the r memory spaces with which he is associated, and transmits
his own query to DB.

2. Otherwise, user Ui chooses uniformly at random one of the r memory spaces with which he is
associated, say Sℓ, and then he chooses uniformly at random a user Uj ∈ Sℓ \ {Ui}. Finally, user
Ui requests that user Uj act as his proxy using the memory space Sℓ.
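
The source's random choices in Protocols 3 and 4 can be sketched as follows. This is an illustrative sampler only: it models the selection of a proxy and memory space, not encryption or message passing, and the function name `choose_proxy`, its `version` and `rng` parameters, and the integer user labels are assumptions of the example.

```python
import random


def choose_proxy(source, users, memory_spaces, version=1, rng=random):
    """Return (proxy, space_index) for one query issued by `source`.

    space_index is None when no memory space is used (Version 1, own proxy).
    """
    own = [i for i, space in enumerate(memory_spaces) if source in space]
    if rng.random() < 1 / len(users):
        # Step 1: the source acts as his own proxy.
        # In Version 2 he also writes the query to one of his own memory spaces.
        return source, (rng.choice(own) if version == 2 else None)
    # Step 2: uniform memory space, then a uniform other member of that space.
    index = rng.choice(own)
    proxy = rng.choice(sorted(memory_spaces[index] - {source}))
    return proxy, index


# Projective plane of order 2 (a BIBD on 7 users), used here as a test design.
FANO = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
rng = random.Random(0)
print([choose_proxy(1, range(1, 8), FANO, version=1, rng=rng) for _ in range(5)])
```

The only difference between the two versions is whether a self-proxied query leaves a trace in a memory space, which matters for what other users can observe but not for what DB sees.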

Remark 5.1. We note that Protocol 4 differs from Protocol 3 only in the first step.

Remark 5.2. We assume users check memory spaces regularly and act as proxies as requested within
a reasonable time interval.

Remark 5.3. We make the assumption that, when a source Ui requests Uj to be his proxy, everyone
in the associated memory space knows that this request has been made, but no one (except for Ui)
knows the identity of the source.

Remark 5.4. The choice between Protocol 3 and Protocol 4 impacts the amount of privacy the
scheme provides against other users. This will be discussed in Section 6.

We analyze the situation from the point of view of the database. For the rest of the paper, we
let S, P, and M be random variables for source, proxy, and memory space, respectively.

Theorem 5.1. From the point of view of the database, the Proxy-designated BIBD Protocols
(Protocols 3 and 4) satisfy Pr[S = Ui | P = Uj] = Pr[S = Ui] for all Ui, Uj ∈ U.

Proof. First, the schemes ensure that Pr[P = Uj | S = Ui] = 1/v for all Ui, Uj. To see this, first note
that Ui will pick himself as the proxy with probability 1/v. In Protocol 3, Ui will then submit his
query directly to the database. In Protocol 4, Ui will pick one of the r memory spaces with which
he is associated uniformly at random and then act as his own proxy. So in both cases, we have

\[
  \Pr[P = U_i \mid S = U_i] = \frac{1}{v}.
\]

Then in both protocols, with probability (v - 1)/v, user Ui will pick a memory space Sℓ (with Ui ∈ Sℓ)
uniformly at random, followed by a proxy Uj associated with Sℓ. The probability that a fixed Uj
with i ≠ j will act as proxy can be computed as follows.

For i ≠ j, we have

\[
  \Pr[P = U_j \mid S = U_i]
    = \frac{v-1}{v} \sum_{\{\ell \,:\, U_i, U_j \in S_\ell\}} \Pr[M = S_\ell] \, \Pr[P = U_j \mid M = S_\ell]
    = \frac{v-1}{v} \sum_{\{\ell \,:\, U_i, U_j \in S_\ell\}} \frac{1}{r(k-1)}
    = \left(\frac{v-1}{v}\right) \left(\frac{\lambda}{r(k-1)}\right)
    = \frac{1}{v},
\]

where the last equality uses the BIBD identity λ(v - 1) = r(k - 1).

Similarly, we can see that Pr[P = Uj] = 1/v for all Uj ∈ U:

\[
  \Pr[P = U_j]
    = \sum_{\{\ell \,:\, U_j \in S_\ell\}} \Pr[M = S_\ell] \, \Pr[P = U_j \mid M = S_\ell]
    = \frac{r}{bk}
    = \frac{1}{v},
\]

since bk = vr. (The same value also follows from the law of total probability, because
Pr[P = Uj | S = Ui] = 1/v for every source Ui.)

Now we have

\[
  \Pr[S = U_i \mid P = U_j]
    = \frac{\Pr[P = U_j \mid S = U_i] \, \Pr[S = U_i]}{\Pr[P = U_j]}
    = \Pr[S = U_i],
\]

so the identity of the proxy gives no information about the identity of the source.

We observe that this analysis is independent of any computational assumptions, so the security 
is unconditional. Since we have achieved a perfect anonymity property, it follows that no information 
is obtained by analyzing linked queries. 
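
The key step of the argument, that the proxy is uniformly distributed for every source, can be checked exactly on small designs. The sketch below computes Pr[P = Uj | S = Ui] with rational arithmetic for the selection rule of Protocols 3 and 4. Applying that rule to the configuration of Example 4.1 is a hypothetical comparison added here, not something proposed in the text; the function name and integer user labels are likewise assumptions.

```python
from fractions import Fraction


def proxy_distribution(source, users, memory_spaces):
    """Exact distribution of the proxy, given the source, under the BIBD protocols' rule."""
    users = list(users)
    v = len(users)
    own = [space for space in memory_spaces if source in space]
    dist = {u: Fraction(0) for u in users}
    dist[source] += Fraction(1, v)  # step 1: own proxy with probability 1/v
    for space in own:  # step 2: uniform own space, then uniform other member
        for proxy in space - {source}:
            dist[proxy] += Fraction(v - 1, v) * Fraction(1, len(own)) * Fraction(1, len(space) - 1)
    return dist


# Projective plane of order 2: a BIBD with lambda = 1 on 7 users.
FANO = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
# The (12, 8, 2, 3)-configuration of Example 4.1, which is not a BIBD.
CONFIG = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12},
          {1, 4, 7}, {2, 5, 10}, {3, 8, 11}, {6, 9, 12}]

print(proxy_distribution(1, range(1, 8), FANO))
print(proxy_distribution(1, range(1, 13), CONFIG))
```

On the projective plane every user is the proxy with probability exactly 1/7, whichever user is the source. On the configuration the distribution is concentrated on the source's neighborhood, so the proxy's identity does carry information about the source.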

Example 5.1. To illustrate, consider a projective plane of order 2 with the following blocks:

{U1, U2, U3} {U1, U4, U5} {U1, U6, U7} {U2, U4, U6}
{U2, U5, U7} {U3, U4, U7} {U3, U5, U6}

We note that this is a (7, 7, 3, 3, 1)-BIBD. Suppose that the first query uses block {U2, U4, U6}
with proxy U4, and the second query uses block {U2, U5, U7} with proxy U2. From the first query,
DB knows that one of three blocks was used: {U1, U4, U5}, {U2, U4, U6}, or {U3, U4, U7}. However,
Pr[S = Ui | P = U4] = Pr[S = Ui] for all possible sources Ui, so DB has no additional information
about the identity of the source, given that P = U4. From the second query, DB knows that one
of three blocks was used: {U1, U2, U3}, {U2, U4, U6}, or {U2, U5, U7}. Again, Pr[S = Ui | P = U2] =
Pr[S = Ui] for all possible sources Ui, so DB has no additional information about the identity of the
source, given that P = U2. So even if DB suspects that both queries came from the same source,
he has no way to identify the source.

5.1 Extensions

We can consider using less structured designs than BIBDs, such as pairwise balanced designs or
covering designs. It turns out that we can still achieve perfect anonymity with respect to DB,
because our anonymity argument remains valid provided that Pr[P = Uj | S = Ui] = 1/v for all
Ui, Uj ∈ U.

We next give a generalized protocol based on an arbitrary covering design. That is, we do not
require constant block size k or constant replication number r.

Protocol 5 Proxy-designated Covering Design Protocol (Version 1)

We fix a covering design. To submit a query, a user Ui performs the following steps:

1. User Ui chooses the designated proxy Uj uniformly at random. If Ui = Uj, then Ui submits his
query directly to DB and skips Step 2.

2. If Ui ≠ Uj, then user Ui chooses uniformly at random one of the memory spaces that contains
both Ui and Uj, say Sℓ. Then Ui requests that user Uj act as his proxy using memory space Sℓ.
Protocol 6 Proxy-designated Covering Design Protocol (Version 2) 

We fix a covering design. To submit a query, a user $U_i$ performs the following steps: 

1. User $U_i$ chooses the designated proxy $U_j$ uniformly at random. (The user $U_i$ may choose himself 
as the proxy $U_j$.) 

2. User $U_i$ chooses uniformly at random one of the memory spaces that contains both $U_i$ and $U_j$, 
say $S_\ell$. Then $U_i$ requests that user $U_j$ act as his proxy using memory space $S_\ell$. 

Remark 5.5. If the covering design is a BIBD, then Protocol 5 is equivalent to Protocol 3 and 
Protocol 6 is equivalent to Protocol 4. 

Remark 5.6. We must have a covering design to ensure that a suitable memory space $S_\ell$ always 
exists in Step 2 of Protocols 5 and 6. 

Remark 5.7. As in Protocols 3 and 4, we assume users check memory spaces regularly, and act as 
proxies as requested within a reasonable time interval. We also assume, as before, that when source 
$U_i$ requests that $U_j \neq U_i$ be his proxy, everyone in the associated memory space knows that this 
request has been made, but no one (except for $U_i$) knows the identity of the source. 

Theorem 5.2. From the point of view of the database, for a given query, the Proxy-designated 
Covering Design Protocols (Protocols 5 and 6) satisfy $\Pr[S = U_i \mid P = U_j] = \Pr[S = U_i]$ for all 
$U_i, U_j \in \mathcal{U}$. 

Proof. Step 1 of both Protocol 5 and Protocol 6 ensures that $\Pr[P = U_j \mid S = U_i] = \frac{1}{v}$ for all 
$U_i, U_j \in \mathcal{U}$. Similarly, we can see that $\Pr[P = U_j] = \frac{1}{v}$ for all $U_j$. We once again have 

\[ \Pr[S = U_i \mid P = U_j] = \frac{\Pr[P = U_j \mid S = U_i] \Pr[S = U_i]}{\Pr[P = U_j]} = \Pr[S = U_i], \]

so the identity of the proxy gives no information about the identity of the source. 

As before, we observe that this analysis is independent of any computational assumptions, so 
the security is unconditional. Since we have achieved a perfect anonymity property, no information 
is obtained by analyzing linked queries. 
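
The two covering design protocols can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the covering design `BLOCKS` on five users (with unequal block sizes) is hypothetical, as are the function names `submit` and `joint_distribution`. The first function samples one run of the protocol; the second computes the exact joint distribution of memory space and proxy.

```python
import random
from fractions import Fraction

# Hypothetical covering design on users 1..5 with unequal block sizes;
# every pair of users appears together in at least one block.
BLOCKS = [{1, 2, 3}, {1, 4, 5}, {2, 4}, {2, 5}, {3, 4, 5}, {1, 2, 4}]
USERS = [1, 2, 3, 4, 5]

def submit(source, version, rng=random):
    """One run of Protocol 5 (version=1) or Protocol 6 (version=2).

    Returns (proxy, memory_space); memory_space is None when the source
    submits directly to the database (Protocol 5 with proxy == source).
    """
    proxy = rng.choice(USERS)                      # Step 1: uniform proxy
    if proxy == source and version == 1:
        return proxy, None                         # direct submission
    shared = [b for b in BLOCKS if source in b and proxy in b]
    return proxy, rng.choice(shared)               # Step 2: uniform shared space

def joint_distribution(source, version):
    """Exact Pr[M = block, P = proxy | S = source] for the chosen version."""
    dist = {}
    for proxy in USERS:
        if proxy == source and version == 1:
            dist[(proxy, None)] = Fraction(1, len(USERS))
            continue
        shared = [frozenset(b) for b in BLOCKS if source in b and proxy in b]
        for block in shared:
            dist[(proxy, block)] = Fraction(1, len(USERS) * len(shared))
    return dist
```

Summing `joint_distribution(s, version)` over memory spaces gives a proxy marginal of exactly 1/5 for every source `s`, even though the block sizes and replication numbers of this design are not constant.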

5.2 Dynamic P2P UPIR Schemes 

One benefit of using less structured designs than BIBDs is that the scheme can be dynamic. That 
is, we can add and remove users, which allows greater flexibility in practice. 

To delete a user $U_i$ from Protocols 5 and 6, we simply remove $U_i$ from all the memory spaces 
with which he is associated. To prevent $U_i$ from reading any more queries written to these memory 
spaces, we also need a rekeying mechanism to update the associated keys. The same external entity 
that distributed the initial set of keys could be responsible for rekeying. The end result is a covering 
design with one fewer user than before. 

To add a user $U_{new}$ in Protocols 5 and 6, we may use the following method. We first find 
$\mathcal{M} = \{S_{h_1}, \ldots, S_{h_\ell}\} \subseteq \mathcal{S}$ such that $S_{h_1} \cup \cdots \cup S_{h_\ell} = \mathcal{U}$. That is, we need a set of memory spaces 
whose union contains all current users. A greedy algorithm could be used to accomplish this task, 
although the resultant set $\mathcal{M}$ would likely not be optimal. Indeed, finding the minimum such set is 
NP-hard. (This is the minimum cover problem, which is problem SP5 in Garey and Johnson [1].) 
Once we have identified a suitable set $\mathcal{M}$, we simply add $U_{new}$ to each memory space in $\mathcal{M}$, 
and give $U_{new}$ the associated keys. In addition, we need a mechanism by which to inform all users 
of $U_{new}$'s presence in the scheme. The resulting set system is still a covering design, now 
containing one more user than before, because $U_{new}$ shares a memory space with every existing user. 
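
The procedure for adding a user can be sketched with the standard greedy heuristic for set cover. The function names `greedy_cover`, `add_user`, and `is_covering_design` are hypothetical, and key distribution and user notification are outside the scope of this sketch.

```python
def greedy_cover(blocks, users):
    """Greedily pick indices of blocks whose union contains all `users`.

    Standard greedy set-cover heuristic: the result is a valid cover but
    not necessarily a minimum one.
    """
    uncovered = set(users)
    chosen = []
    while uncovered:
        best = max(range(len(blocks)), key=lambda i: len(blocks[i] & uncovered))
        if not blocks[best] & uncovered:
            raise ValueError("blocks do not cover all users")
        chosen.append(best)
        uncovered -= blocks[best]
    return chosen

def add_user(blocks, users, new_user):
    """Return a new block list in which `new_user` joins every block of a greedy cover."""
    cover = set(greedy_cover(blocks, users))
    return [b | {new_user} if i in cover else set(b) for i, b in enumerate(blocks)]

def is_covering_design(blocks, users):
    """Check that every pair of distinct users shares at least one block."""
    users = sorted(users)
    return all(any(a in b and c in b for b in blocks)
               for i, a in enumerate(users) for c in users[i + 1:])

# Usage with the projective plane of order 2 on users 1..7 and a new user 8.
fano = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
extended = add_user(fano, range(1, 8), 8)
```

In this usage example the greedy cover consists of the three blocks through user 1, so user 8 joins three memory spaces and the extended system is a covering design on eight users.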

6 Privacy Against Other Users 

In this section, we consider our Protocols 3, 4, 5, and 6 in the context of analyzing user-privacy 
against other users. We remind the reader of Remarks 5.3 and 5.7: we assume that when a source $U_i$ 
requests that $U_j$ be his proxy, everyone in the associated memory space knows that this request has 
been made, but no one (except for $U_i$) knows the identity of the source. We now analyze the privacy 
of a given user relative to other users of the scheme. As we will see, if we wish to provide privacy 
against other users, a design that has more structure than a general covering design becomes useful. 
In particular, we will observe that the use of a regular PBD (see Definition 3.3) in Protocols 5 and 6 
is desirable. 

It is helpful to begin with an example: 

Example 6.1. Consider the projective plane from Example 5.1 and suppose we use Protocol 3. 
Suppose that user $P = U_4$ is requested to make a query in memory space $\{U_1, U_4, U_5\}$ by source 
$S = U_1$. User $U_4$ knows that the source must be $U_1$ or $U_5$ (since he did not make the request 
himself). User $U_5$, however, knows that the source must be $U_1$ because 

1. $U_5$ did not make the request himself, and 

2. $U_4$ would not post a request to himself to transmit a query; he would simply transmit 
it himself. 


We can generalize the concept from Example 6.1. Observe that in Protocols 3 and 5, the 
requested proxy can rule out one possible source, and anyone else in the memory space (who is 
not the source) can rule out two possible sources. If we consider Protocols 4 and 6, then users 
can rule out only one possible source (namely, themselves). That is, Protocols 4 and 6 improve 
the information-theoretic privacy guarantees of the scheme with respect to the viewpoint of other 
users. However, we remark that in these versions, when a source acts as his own proxy, other users 
associated with the chosen memory space can see the content of the query. In Protocols 3 and 5, if a 
user $U_i$ is both the source and proxy of a given query, then $U_i$ is the only user who sees the content 
of that query. Hence it may still be desirable to use Protocols 3 and 5 if additional confidentiality 
is required. 

An interesting related question is whether a particular user $U_t$, who sees a query $q$ posted to the 
memory space $S_\ell$ that is not his own, has a probabilistic advantage in guessing the source 
of $q$. The following theorems show that, in order to minimize any such advantage, it is helpful to 
use a regular PBD in our protocols. 

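Theorem 6.1. Let $(X, \mathcal{A})$ be a regular PBD of degree $r$. Assume $(X, \mathcal{A})$ is used in the Proxy-designated 
Covering Design Protocol (Protocol 5) and assume that $\Pr[S = U_i] = \frac{1}{v}$ for all $U_i \in \mathcal{U}$. 
Suppose $U_t \in S_\ell$ sees a query $q$ posted to $S_\ell$ that is not his own. Then, from the point of view of 
$U_t$, for a given query $q$ and $U_i, U_j \in S_\ell$ such that $i \neq t$, it holds that 

\[ \Pr[S = U_i \mid M = S_\ell, P = U_j] = \begin{cases} 0 & \text{if } i = j, \\ \frac{1}{|S_\ell| - 2} & \text{if } i \neq j. \end{cases} \]
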
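Proof. We first note that the protocol definition ensures that when $i = j$, we have 
$\Pr[S = U_i \mid M = S_\ell, P = U_j] = 0$. 

We now consider the case $i \neq j$. We set $\lambda_{ij} = |\{S_g \mid U_i, U_j \in S_g\}| = \lambda$. Thus, we have 

\[ \Pr[M = S_\ell, P = U_j \mid S = U_i] = \Pr[M = S_\ell \mid P = U_j, S = U_i] \Pr[P = U_j \mid S = U_i] = \frac{1}{\lambda_{ij}} \cdot \frac{1}{v} = \frac{1}{\lambda v}. \]

Then because $i \neq t, j$, we have $\Pr[S = U_i] = \frac{1}{v-2}$ and 

\begin{align*}
\Pr[S = U_i \mid M = S_\ell, P = U_j] &= \frac{\Pr[S = U_i] \Pr[M = S_\ell, P = U_j \mid S = U_i]}{\Pr[M = S_\ell, P = U_j]} \\
&= \frac{\Pr[S = U_i] \Pr[M = S_\ell, P = U_j \mid S = U_i]}{\sum_{U_h \in S_\ell,\, h \neq t, j} \Pr[S = U_h] \Pr[M = S_\ell, P = U_j \mid S = U_h]} \\
&= \frac{\frac{1}{v(v-2)\lambda}}{(|S_\ell| - 2) \cdot \frac{1}{v(v-2)\lambda}} = \frac{1}{|S_\ell| - 2},
\end{align*}

as desired. 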



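Theorem 6.2. Let $(X, \mathcal{A})$ be a regular PBD of degree $r$. Assume $(X, \mathcal{A})$ is used in the Proxy-designated 
Covering Design Protocol (Protocol 6) and assume that $\Pr[S = U_i] = \frac{1}{v}$ for all $U_i \in \mathcal{U}$. 
Suppose $U_t \in S_\ell$ sees a query $q$ posted to $S_\ell$ that is not his own. Then, from the point of view of 
$U_t$, for a given query $q$ and $U_i, U_j \in S_\ell$ such that $i \neq t$, it holds that 

\[ \Pr[S = U_i \mid M = S_\ell, P = U_j] = \begin{cases} \frac{\lambda}{\lambda + r(|S_\ell| - 2)} & \text{if } i = j, \\ \frac{r}{\lambda + r(|S_\ell| - 2)} & \text{if } i \neq j. \end{cases} \]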



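Proof. We first calculate $\Pr[M = S_\ell, P = U_j \mid S = U_i]$, where $U_i \in S_\ell$. We again set 
$\lambda_{ij} = |\{S_g \mid U_i, U_j \in S_g\}|$. Since $(X, \mathcal{A})$ is a regular PBD of degree $r$, we have 

\[ \lambda_{ij} = \begin{cases} r & \text{if } i = j, \\ \lambda & \text{if } i \neq j. \end{cases} \]

We have 

\[ \Pr[M = S_\ell, P = U_j \mid S = U_i] = \Pr[M = S_\ell \mid P = U_j, S = U_i] \Pr[P = U_j \mid S = U_i] = \frac{1}{\lambda_{ij}} \cdot \frac{1}{v}. \]

Then because $i \neq t$, we have $\Pr[S = U_i] = \frac{1}{v-1}$ and 

\begin{align*}
\Pr[S = U_i \mid M = S_\ell, P = U_j] &= \frac{\Pr[S = U_i] \Pr[M = S_\ell, P = U_j \mid S = U_i]}{\Pr[M = S_\ell, P = U_j]} \\
&= \frac{\Pr[S = U_i] \Pr[M = S_\ell, P = U_j \mid S = U_i]}{\sum_{U_h \in S_\ell,\, h \neq t} \Pr[S = U_h] \Pr[M = S_\ell, P = U_j \mid S = U_h]} \\
&= \frac{\frac{1}{v(v-1)\lambda_{ij}}}{\sum_{U_h \in S_\ell,\, h \neq t} \frac{1}{v(v-1)\lambda_{hj}}} = \frac{\frac{1}{\lambda_{ij}}}{\sum_{U_h \in S_\ell,\, h \neq t} \frac{1}{\lambda_{hj}}},
\end{align*}

which yields the desired result. 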
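
The posteriors in Theorems 6.1 and 6.2 can be computed exactly for a small design. The sketch below is an illustration only: it assumes the projective plane of order 2 on users 1 to 7 (so $r = 3$, $\lambda = 1$, and $|S_\ell| = 3$), a uniform prior on the remaining candidate sources (which cancels in Bayes' rule), and an observer who is not the proxy. The names `likelihood` and `posterior` are hypothetical.

```python
from fractions import Fraction

# Assumed example design: the projective plane of order 2 on users 1..7.
BLOCKS = [frozenset(b) for b in
          ({1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6})]
V = 7

def likelihood(source, proxy, block, version):
    """Pr[M = block, P = proxy | S = source] in Protocol 5 (version=1) or 6 (version=2)."""
    if source not in block or proxy not in block:
        return Fraction(0)
    if source == proxy and version == 1:
        return Fraction(0)          # Protocol 5: a self-proxied query is never posted
    shared = sum(1 for b in BLOCKS if source in b and proxy in b)   # lambda_ij
    return Fraction(1, V * shared)

def posterior(observer, proxy, block, version):
    """Distribution over sources as seen by `observer`, who did not issue the query."""
    weights = {h: likelihood(h, proxy, block, version) for h in block if h != observer}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

space = frozenset({1, 4, 5})
view_v1 = posterior(observer=5, proxy=4, block=space, version=1)
view_v2 = posterior(observer=5, proxy=4, block=space, version=2)
```

Here `view_v1` assigns probability 1 to user 1, matching Example 6.1 and the value $1/(|S_\ell| - 2)$, while `view_v2` assigns 1/4 to the proxy and 3/4 to user 1, matching $\lambda/(\lambda + r(|S_\ell| - 2))$ and $r/(\lambda + r(|S_\ell| - 2))$.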

Remark 6.1. Theorems 6.1 and 6.2 apply to Protocols 3 and 4, respectively, since a BIBD is also a 
PBD that is regular of degree $r$. 

Theorems 6.1 and 6.2 demonstrate that the use of a regular PBD increases privacy against other 
users. This is because, from the point of view of another user, the possible source distribution is 
closer to uniform. 

Theorem 6.1 implies that for $U_t \in S_\ell$, if $U_t$ sees a query $q$ with proxy $U_j$ posted to $S_\ell$ that is not 
his own, any of the remaining $|S_\ell| - 2$ users in $S_\ell$ are equally likely to be the source. If Protocol 6 
is used instead of Protocol 5, then $U_t$ can no longer completely eliminate the possibility of the 
proxy $U_j$ being the source. However, as Theorem 6.2 shows, the likelihood of the proxy $U_j$ being 
the source is not the same as the likelihood of $U_i \neq U_j$ being the source. Indeed, it is less likely 
by a factor of $\lambda / r$ that $U_j$ is acting as both proxy and source for $q$ in this situation. Intuitively, 
if a user $U_i$ is acting as both source and proxy, he has $r$ possible memory spaces to choose from, 
whereas if $U_i$ chooses another user $U_j$ as proxy, he has only $\lambda$ memory spaces to choose from. 

6.1 Linked Queries and Coalitions of Users 

Users can also launch an intersection attack against a series of linked queries, similar to the 
intersection attack launched by DB against the DBWM and DBWMS protocols (Protocols 1 and 2). 
The difference here is that users have access to the content of queries via the shared memory spaces; 
that is, users of a given memory space know which queries have been posted to that memory space, 
whereas the database only knows the identity of the proxy. 

Example 6.2. Consider the projective plane from Example 5.1 and suppose we use Protocol 4. 
Suppose that $U_1$ is the source of two linked queries, where the first query uses memory space 
$\{U_1, U_2, U_3\}$ and the second query uses memory space $\{U_1, U_4, U_5\}$. Now suppose that users $U_2$ and 
$U_5$ collude. From the first query, user $U_2$ knows that the source lies in $\{U_1, U_3\}$ (regardless of the proxy). From 
the second query, user $U_5$ knows that the source lies in $\{U_1, U_4\}$ (regardless of the proxy). If users $U_2$ and $U_5$ 
collude, then they can identify $U_1$ as the source. 
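
The intersection attack of Example 6.2 can be sketched in a few lines. The function name `candidate_sources` is hypothetical, users are labelled by integers, and the sketch assumes the Version 2 setting in which a colluding user who sees a query can rule out only members of the coalition.

```python
def candidate_sources(observations, coalition):
    """Intersect what colluding users learn from a list of linked queries.

    `observations` holds one memory space (a set of users) per linked query.
    A query is visible to the coalition only if some member belongs to its
    memory space; the source is then some non-colluding member of that space.
    Returns None if the coalition saw none of the queries.
    """
    candidates = None
    for space in observations:
        if not (space & coalition):
            continue                      # nobody in the coalition saw this query
        seen = space - coalition
        candidates = seen if candidates is None else candidates & seen
    return candidates

# The two linked queries of Example 6.2.
linked = [{1, 2, 3}, {1, 4, 5}]
alone = candidate_sources(linked, {2})        # user 2 acting alone
together = candidate_sources(linked, {2, 5})  # users 2 and 5 colluding
```

Acting alone, user 2 is left with the candidates `{1, 3}`; together with user 5 the candidate set shrinks to `{1}`.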

In general, we can consider a sequence of p linked queries made by the same (unknown) user, 
and a coalition C of at most c users that is trying to identify the source of the p queries. We 
introduce the following terminology. 
Definition 6.1. Consider a set of $p$ linked queries and fix a maximum coalition size $c$. If there 
are always at least $\kappa$ users who could possibly be the source (regardless of the queries and coalition), 
then we say that the scheme provides $(p, c, \kappa)$-anonymity. 

Remark 6.2. Of course we want $\kappa \geq 2$ because the source might be identified if $\kappa = 1$. 

First, we consider security against a single user (i.e., the case $c = 1$). Here, it is advantageous 
to use a design with $\lambda = 1$: 

Lemma 6.1. Suppose the BIBD chosen for Protocol 4 satisfies $\lambda = 1$. Then we achieve $(p, 1, k-1)$-anonymity 
for any $p$. 

Proof. If $U_i$ sees a sequence of $p$ linked queries from the same source, the queries must all involve 
the same memory space, because $\lambda = 1$ means that $U_i$ and the source share exactly one memory space. 
The result then follows from Theorem 6.2. 

Remark 6.3. The result of Lemma 6.1 does not apply to Protocol 3. This is because in Protocol 3, 
given a series of linked queries posted to a given memory space, the only user who will never act 
as proxy for one of these queries is the query issuer. This is similar to the projective plane attack 
in [9] that we mentioned in Section 4.1. 

On the other hand, the security of Protocol 4 against a single user might be completely eliminated 
if we use a design with $\lambda > 1$. For example, suppose we use a BIBD with $\lambda = 2$ in which 
every pair of blocks intersects in at most two points (such a BIBD is termed supersimple). Consider 
two users $U_i$ and $U_j$. There exist two memory spaces, say $S_1$ and $S_2$, where $S_1 \cap S_2 = \{U_i, U_j\}$. 
Suppose $U_i$ observes two linked queries, say $q_1$ and $q_2$, that involve $S_1$ and $S_2$, respectively. Then 
$U_i$ can deduce that $U_j$ is the source. 

We now consider some more special cases of this problem, for small values of p and for certain 
special types of designs. This is because, in order to analyze the problem of linked queries, it 
becomes necessary to understand the block intersection properties of the scheme's chosen design. 
In Section 6.2, we consider a more general approach to mitigate this type of attack in the Proxy-designated 
Covering Design Protocols (Protocols 5 and 6). 

The case $p = 1$ (i.e., security against a single query) is easy to analyze: 

Lemma 6.2. We achieve $(1, c, k - c - 1)$-anonymity in Protocol 3, where $c \leq k - 3$. In Protocol 4, 
we achieve $(1, c, k - c)$-anonymity, with the requirement that $c \leq k - 2$. 

Proof. We first consider Protocol 3. Let $C$ be a coalition of size at most $c$ and let $S_h$ be the memory 
space used for the query $q_1$. Then $|C \cap S_h| \leq c$. $C$ can rule out as possible sources the users in 
$C \cap S_h$ as well as the proxy $U_j$ (provided that $U_j \notin C \cap S_h$). Since $|S_h \setminus (C \cup \{U_j\})| \geq k - c - 1$, the 
result follows. An obvious requirement here is $c \leq k - 3$. 

For Protocol 4, all other users with access to the given memory space can only eliminate themselves 
as the possible source of the query. This improves the information-theoretic security for 
user-privacy against other users, as we now have $|S_h \setminus C| \geq k - c$. An obvious requirement here is 
$c \leq k - 2$. 

For the case p = 2, we consider BIBDs with a special intersection property. 

Lemma 6.3. Suppose the BIBD of Protocols 3 and 4 satisfies the additional property that any 
two blocks intersect in at least $\mu$ points. Consider two linked queries, $q_1$ and $q_2$. Then we achieve 
$(2, c, \mu - c - 2)$-anonymity, where $c \leq \mu - 4$, in Protocol 3. In Protocol 4, we achieve $(2, c, \mu - c)$-anonymity, 
with the requirement $c \leq \mu - 2$. 

Proof. Let $C$ be a coalition of size at most $c$ and let $S_{h_1}$ be the memory space used for the query 
$q_1$ and $S_{h_2}$ be the memory space used for $q_2$. Let $U_i$ be the proxy for $q_1$ and let $U_j$ be the proxy 
for $q_2$. 

In Protocol 3, we have 

\[ |(S_{h_1} \setminus (C \cup \{U_i\})) \cap (S_{h_2} \setminus (C \cup \{U_j\}))| = |(S_{h_1} \cap S_{h_2}) \setminus (C \cup \{U_i, U_j\})| \geq \mu - c - 2, \]

so we achieve $(2, c, \mu - c - 2)$-anonymity. An obvious requirement here is $c \leq \mu - 4$. 
In Protocol 4, we have 

\[ |(S_{h_1} \setminus C) \cap (S_{h_2} \setminus C)| = |(S_{h_1} \cap S_{h_2}) \setminus C| \geq \mu - c, \]

so we achieve $(2, c, \mu - c)$-anonymity. Here, an obvious requirement is $c \leq \mu - 2$. 

We can apply Lemma 6.3 to the case of a symmetric BIBD, in which any two blocks intersect 
in exactly $\lambda$ points, as noted in Theorem 3.2. This achieves the following result: 

Corollary 6.1. Suppose the BIBD chosen for Protocols 3 and 4 is a symmetric $(v, v, k, k, \lambda)$-BIBD. 
Then Protocol 3 provides $(2, c, \lambda - c - 2)$-anonymity for any $c \leq \lambda - 4$ and Protocol 4 provides 
$(2, c, \lambda - c)$-anonymity for any $c \leq \lambda - 2$. 

An interesting extension to the concept of $(p, c, \kappa)$-anonymity is to consider an average-case 
analysis of privacy against other users. With $(p, c, \kappa)$-anonymity, we are analyzing the worst-case 
scenario, that is, the minimum level of privacy the scheme achieves against any possible coalition. While 
this is useful in some respects, schemes exhibiting powerful worst-case scenario attacks might actually 
perform quite well against a typical coalition. In particular, if a scheme needs to be concerned 
about random coalitions of users, such an average-case analysis might prove informative, as the 
following example shows. 

Example 6.3. Suppose we use a symmetric $(v, v, k, k, 3)$-BIBD in any of our P2P UPIR protocols. 
Consider linked queries $q_1$ and $q_2$ submitted by $U_i$, with corresponding memory spaces $S_{h_1}$ and $S_{h_2}$. 
By Theorem 3.2, since the BIBD is symmetric, we have $|S_{h_1} \cap S_{h_2}| = 3$. That is, there are exactly 
two other users, say $U_j$ and $U_t$, in both $S_{h_1}$ and $S_{h_2}$. This implies that there is only one coalition of 
users of size 2 that can identify $U_i$ as the source. If we consider random coalitions, the probability 

that a random coalition of size 2 consists of $\{U_j, U_t\}$ is $1 / \binom{v}{2}$. 

Let us consider other coalitions of size 2. Suppose $C = \{U_j, U_\ell\}$, for some user $U_\ell \neq U_t, U_i$. Then 
$C$ knows the source is either $U_t$ or $U_i$. There are $v - 3$ such coalitions. The analysis for $C$ containing 
$U_t$ but not $U_j$ is similar. If we consider $C = \{U_\ell, U_{\ell'}\}$ such that $U_t, U_j \notin C$, the most advantageous 
coalition satisfies (without loss of generality) $U_\ell \in S_{h_1}$, $U_{\ell'} \in S_{h_2}$. In this case, $C$ sees both $q_1$ and 
$q_2$ and can conclude that the source is one of $\{U_i, U_j, U_t\}$. There are $(k - 3)^2$ such coalitions. Other 
coalitions of size 2 either see only one of $\{q_1, q_2\}$, in which case the analysis reduces to that of 
Theorem 6.1 or 6.2, or neither of the linked queries, in which case $C$ can do nothing. 
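
The counts in Example 6.3 can be reproduced by enumerating coalitions in a concrete symmetric BIBD with $\lambda = 3$. The sketch below rests on several assumptions that are not taken from the text: it uses the symmetric $(11, 11, 6, 6, 3)$-BIBD developed from the difference set $\{0, 2, 6, 7, 8, 10\}$ in $\mathbb{Z}_{11}$, labels users 0 to 10, fixes user 0 as the source, and lets colluding users rule out only members of the coalition (the Version 2 setting). The name `candidates` is hypothetical.

```python
from collections import Counter
from itertools import combinations

V = 11
BASE = {0, 2, 6, 7, 8, 10}   # complement of the quadratic residues modulo 11
BLOCKS = [frozenset((x + s) % V for x in BASE) for s in range(V)]

def candidates(coalition, spaces):
    """Users the coalition cannot rule out, or None if it saw no query."""
    coalition = set(coalition)
    seen = [s - coalition for s in spaces if s & coalition]
    return frozenset.intersection(*seen) if seen else None

source = 0
s1, s2 = [b for b in BLOCKS if source in b][:2]   # memory spaces of q1 and q2
others = [u for u in range(V) if u != source]

# Histogram of candidate-set sizes over coalitions of size 2 that see both queries.
sizes = Counter()
for coalition in combinations(others, 2):
    if all(s & set(coalition) for s in (s1, s2)):
        sizes[len(candidates(coalition, (s1, s2)))] += 1
```

With $v = 11$ and $k = 6$, exactly one coalition identifies the source, $2(v - 3) = 16$ coalitions narrow it down to two users, and $(k - 3)^2 = 9$ coalitions narrow it down to three users.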

6.2 Some Methods to Increase Privacy 

t-anonymity sets. Beyond the limited cases described above, it is difficult to analyze the privacy 
guarantees of the proxy-designated BIBD and covering design protocols in the presence of linked 
queries. In particular, it becomes difficult to analyze the case of intersections of three or more 
memory spaces, and the size of these intersections probably decreases quickly. We might, however, 
wish to provide privacy for $p > 2$. One possible solution is to introduce the notion of built-in 
permanent anonymity sets for each user. That is, suppose the set of users $\mathcal{U}$ is partitioned into 
anonymity sets $\mathcal{T}_1, \ldots, \mathcal{T}_g$, where each $\mathcal{T}_j$ consists of at least $t$ users. We further assume that the set 
system satisfies the property $\mathcal{T}_\ell \cap S_j \in \{\emptyset, \mathcal{T}_\ell\}$ for all $\ell, j$. We call such a construction a covering 
design with t-anonymity sets. 

Theorem 6.3. Fix a partition $\mathcal{T} = \{\mathcal{T}_1, \ldots, \mathcal{T}_g\}$ of the set of users $\mathcal{U}$, such that each $\mathcal{T}_\ell$ consists of 
at least $t$ users. Then we can construct a covering design with t-anonymity sets. 

Proof. We can construct a covering design with t-anonymity sets by the following method. First, 
we construct a covering design on a set of $g$ points, say $X = \{x_1, \ldots, x_g\}$. We then define a bijection 
$\sigma$ between the set of $g$ points and the $g$ anonymity sets, so $\sigma(X) = \mathcal{T}$. Finally, for each $x_\ell \in X$, 
we replace the point $x_\ell$ by the anonymity set $\sigma(x_\ell) = \mathcal{T}_{\ell'}$, where $1 \leq \ell' \leq g$. This yields a covering 
design satisfying the desired property. 
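
The construction in the proof can be sketched as a point-substitution step. The function names, the three-point base design, and the user labels below are hypothetical and serve only to illustrate the idea.

```python
def expand_with_anonymity_sets(base_blocks, anonymity_sets):
    """Replace each point x of a base covering design by the set anonymity_sets[x].

    `base_blocks` is a list of sets of point labels; `anonymity_sets` maps
    each point label to a set of users and plays the role of the bijection
    sigma in the proof.
    """
    return [set().union(*(anonymity_sets[x] for x in block)) for block in base_blocks]

def has_anonymity_property(blocks, anonymity_sets):
    """Check that every anonymity set meets every block in nothing or in itself."""
    return all((t & b) in (set(), t) for t in anonymity_sets.values() for b in blocks)

# Hypothetical base covering design on three points and anonymity sets with t = 2.
base = [{0, 1}, {1, 2}, {0, 2}]
tsets = {0: {"a", "b"}, 1: {"c", "d"}, 2: {"e", "f", "g"}}
blocks = expand_with_anonymity_sets(base, tsets)
```

Every pair of users in the expanded system shares a block, either because their anonymity sets correspond to a covered pair of points or because they lie in the same anonymity set.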

Theorem 6.4. Fix a covering design with permanent anonymity sets of minimum size $t$. Then we 
achieve $(p, c, t - c - p)$-anonymity in Protocol 5 and $(p, c, t - c)$-anonymity in Protocol 6. 

Proof. Let $C$ be a coalition of size at most $c$ and consider a set of linked queries $q_1, \ldots, q_p$. Let $S_{h_\ell}$ 
be the memory space used for the query $q_\ell$ and let $U_{h_\ell}$ denote the proxy for $q_\ell$, for $1 \leq \ell \leq p$. 
In Protocol 5, we have 

\begin{align*}
&|(S_{h_1} \setminus (C \cup \{U_{h_1}\})) \cap (S_{h_2} \setminus (C \cup \{U_{h_2}\})) \cap \cdots \cap (S_{h_p} \setminus (C \cup \{U_{h_p}\}))| \\
&\quad = |(S_{h_1} \cap S_{h_2} \cap \cdots \cap S_{h_p}) \setminus (C \cup \{U_{h_1}, U_{h_2}, \ldots, U_{h_p}\})| \geq t - c - p.
\end{align*}

In Protocol 6, we have 

\[ |(S_{h_1} \setminus C) \cap (S_{h_2} \setminus C) \cap \cdots \cap (S_{h_p} \setminus C)| = |(S_{h_1} \cap S_{h_2} \cap \cdots \cap S_{h_p}) \setminus C| \geq t - c. \]

This completes the proof. 

The idea of using permanent anonymity sets changes the trust requirements of the scheme. In 
particular, $U_i$ must trust the users contained in his anonymity set $\mathcal{T}_i$ to a greater extent than users in $\mathcal{U} \setminus \mathcal{T}_i$, since 
members of $\mathcal{T}_i$ necessarily have access to $U_i$'s query sphere. That is, there is no confidentiality 
among members of an anonymity set. 

Query hops. Another possible method to increase privacy against other users, which we briefly 
introduce here, involves the introduction of query hops into the protocols. That is, we can consider 
allowing a designated proxy to rewrite a given query to another memory space, rather than simply 
forwarding the query to DB. We can establish a probabilistic approach, such that a designated 
proxy $U_j$ will, with some fixed probability $p$, forward the query to DB; otherwise $U_j$ rewrites the 
query uniformly at random to one of his associated memory spaces. When a response is received, a 
user simply posts the response back to the memory space from which he read the query. This can continue 
until the query response reaches the source. In this case, it is easy to see that on average a query 
is posted $1/p$ times, since the number of postings is geometrically distributed with success 
probability $p$. This method removes the certainty a curious user has that the source of a 
given query is associated with the memory space in which that query is written. It is an interesting 
problem to analyze the privacy guarantees such a scheme provides against other users. 
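
The claim that a query is posted $1/p$ times on average can be illustrated with a short simulation. This is a sketch under simplifying assumptions: every query is posted to a memory space at least once, each designated proxy independently forwards to DB with probability `p`, and the function names and trial count are hypothetical.

```python
import random

def count_postings(p, rng):
    """Number of times one query is written to a memory space before reaching DB."""
    postings = 1                       # the source's initial posting
    while rng.random() >= p:           # the proxy rewrites with probability 1 - p
        postings += 1
    return postings

def mean_postings(p, trials=100_000, seed=0):
    """Empirical average number of postings over `trials` simulated queries."""
    rng = random.Random(seed)
    return sum(count_postings(p, rng) for _ in range(trials)) / trials

estimate = mean_postings(0.25)         # should be close to 1 / 0.25 = 4
```

The empirical mean approaches $1/p$ as the number of trials grows, in agreement with $\sum_{n \geq 1} n\, p (1-p)^{n-1} = 1/p$.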
7 Conclusion 



In this paper, we have given an overview and analysis of current research in UPIR, including 
introducing an attack by the database on user privacy. We have established a new model for P2P 
UPIR and considered the problem of user privacy against other users in detail, going well beyond 
previous work. We have given two new P2P UPIR protocols and provided an analysis of the privacy 
properties provided by these protocols. Our P2P UPIR schemes, by taking advantage of the wide 
variety of available combinatorial designs, provide flexibility in the set-up phase, allowing for a 
choice between having a dynamic scheme (in which users are permitted to enter and leave the 
system), or providing increased privacy against other users. Finally, we have pointed out several 
directions for future research in this area. In particular, there is much work to be done regarding 
user privacy against other users, such as moving beyond the worst-case analysis we provide here 
and considering an average-case analysis, as well as the construction of P2P UPIR schemes that 
utilize query hops to mitigate loss of privacy against other users. 